
fix(docker): Pre-install Playwright Chromium browsers for automated t…#745

Merged
Shironex merged 6 commits into v0.15.0rc from
fix/docker-playwright-missing-browsers
Feb 15, 2026

Conversation

@Shironex
Collaborator

@Shironex Shironex commented Feb 2, 2026

Fixes #725

AI agents in automated testing mode require Playwright to verify implementations, but Docker containers had only system dependencies installed, not browser binaries. This caused verification failures with permissions errors.

Changes:

  • Install Playwright Chromium in Dockerfile (~300MB increase)
  • Update docker-compose.override.yml.example with clearer Playwright documentation
  • Add "Playwright for Automated Testing" section to README
  • Document optional volume mount for persisting browsers across rebuilds

Browsers are now pre-installed and work out of the box for Docker users.

Summary by CodeRabbit

  • New Features

    • Playwright Chromium pre-installed in the container for immediate E2E testing.
    • New "Adaptive" thinking level and adaptive-model behavior exposed in the UI and defaults.
    • New Codex model (GPT‑5.3‑Codex) and updated model defaults; authentication now reports auth type (oauth, api_key, cli).
  • Documentation

    • README and docs updated with Playwright E2E guidance, persistence tips, docker-compose snippet, and troubleshooting.
  • Tests

    • Unit tests added/updated to cover adaptive thinking and credential detection.
  • Chores

    • .gitignore extended to ignore additional local data files.

…esting

Fixes #725

AI agents in automated testing mode require Playwright to verify implementations,
but Docker containers had only system dependencies installed, not browser binaries.
This caused verification failures with permissions errors.

Changes:
- Install Playwright Chromium in Dockerfile (~300MB increase)
- Update docker-compose.override.yml.example with clearer Playwright documentation
- Add "Playwright for Automated Testing" section to README
- Document optional volume mount for persisting browsers across rebuilds

Browsers are now pre-installed and work out of the box for Docker users.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@coderabbitai

coderabbitai bot commented Feb 2, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Pre-installs Playwright Chromium in the Docker image and documents optional browser cache; updates model mappings (Claude Opus → 4.6, adds GPT‑5.3‑Codex), adds adaptive thinking support and thinking-level UI handling, expands detailed CLI/OAuth credential detection and reporting, and updates tests and related UI/electron APIs.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Docker / Playwright | Dockerfile, docker-compose.override.yml.example, .gitignore | Installs Playwright Chromium during image build (runs install as automaker then reverts to root), ensures /home/automaker/.local/bin is on PATH for interactive shells, adds optional Playwright cache volume example and docs; adds data/.mcp.json and data/.planning to .gitignore. |
| Docs & Examples | README.md, docs/server/*, docs/llm-shared-packages.md, libs/model-resolver/README.md, docker-compose.override.yml.example | Adds Playwright E2E testing instructions, documents pre-installed Chromium and optional persistent cache with mount/reinstall guidance, and updates example model strings to new Opus/Codex mappings. |
| Server: auth, startup & SDK options | apps/server/src/index.ts, apps/server/src/routes/setup/routes/verify-claude-auth.ts, apps/server/src/lib/sdk-options.ts, apps/server/src/providers/claude-provider.ts, apps/server/package.json | Detects CLAUDE_CODE_OAUTH_TOKEN, captures detailed cliAuthIndicators (via platform helpers), exposes authType in verify response, adjusts thinking-token logic for adaptive levels, and updates Claude model metadata to Opus 4.6; adds Playwright devDependency. |
| Platform: paths & credential checks | libs/platform/src/system-paths.ts, libs/platform/src/index.ts, libs/platform/tests/oauth-credential-detection.test.ts | Adds FileCheckResult/DirectoryCheckResult types, Windows NVM CLI path handling, and detailed per-path checks in getClaudeAuthIndicators; adds extensive unit tests covering OAuth/API key/edge cases. |
| Types & thinking-level helpers | libs/types/src/settings.ts, libs/types/src/model.ts, libs/types/src/model-display.ts, libs/types/src/codex-models.ts, libs/types/src/index.ts, libs/types/src/event.ts | Adds adaptive thinking level, isAdaptiveThinkingModel and getThinkingLevelsForModel helpers, updates canonical mappings (Opus→4.6), adds Codex GPT‑5.3 model, and extends exported types/events. |
| UI: thinking-levels, auth UX & wiring | apps/ui/src/components/.../thinking-level-selector.tsx, apps/ui/src/components/.../phase-model-selector.tsx, apps/ui/src/components/.../model-constants.ts, apps/ui/src/components/views/setup-view/steps/claude-setup-step.tsx, apps/ui/src/lib/electron.ts, apps/ui/src/lib/http-api-client.ts, apps/ui/src/components/dialogs/sandbox-risk-dialog.tsx | Surfaces adaptive option and dynamic thinking-level lists tied to models, auto-adjusts thinkingLevel on model change, adds authType handling to setup UI and electron/http API types, and inserts Docker troubleshooting copy. |
| Providers & model lists | apps/server/src/providers/codex-models.ts, apps/server/src/providers/provider-factory.ts, apps/server/tests/unit/providers/*.test.ts | Adds GPT‑5.3‑Codex model entry, updates Codex descriptions, and replaces Opus 4.5 references with 4.6 across provider code and tests. |
| Tests | apps/server/tests/unit/*, libs/model-resolver/tests/*, libs/platform/tests/* | Updates tests to expect Opus→claude-opus-4-6, adds adaptive-thinking unit tests (ensure maxThinkingTokens unset), and introduces a large credential-detection test suite. |

Sequence Diagram(s)

(Skipped)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested labels

Documentation

Poem

🐇 I hopped through Dockerfiles with nimble paws,
Poured Chromium in so tests obey the laws.
A cache for browsers, snug in a named volume bed,
Agents now click happily, no errors to dread.
Carrots for CI, and cheers for passing heads!

🚥 Pre-merge checks | ✅ 3 | ❌ 3
❌ Failed checks (3 warnings)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Linked Issues check | ⚠️ Warning | The PR implements the primary requirement from #725 (pre-installing Playwright Chromium in Docker) but has a critical blocker: the Dockerfile playwright install commands fail because @playwright/test is missing from apps/server package.json. PR #751 adds this dependency, which is necessary for the changes to work. | Merge PR #751 first to add @playwright/test to apps/server devDependencies, or cherry-pick that commit into this PR before merging to ensure Docker builds succeed. |
| Out of Scope Changes check | ⚠️ Warning | The PR includes several changes beyond the Playwright browser pre-installation scope: Claude Opus 4.5→4.6 model upgrades, new Codex 5.3 model, adaptive thinking implementation, OAuth detection improvements, and platform authentication refactoring. | Separate the Playwright Docker/documentation changes from the unrelated model version, thinking level, and authentication changes into distinct PRs for clearer review and deployment. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 63.16% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: pre-installing Playwright Chromium browsers in the Docker image for automated testing, which directly addresses the primary objective. |
| Merge Conflict Detection | ✅ Passed | ✅ No merge conflicts detected when merging into main |

@gemini-code-assist
Contributor

Summary of Changes

Hello @Shironex, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses issue #725 by pre-installing Playwright Chromium in the Docker image, enabling automated testing without requiring additional setup. It also includes documentation updates and configuration examples to guide users on how to persist the browser cache across container rebuilds.

Highlights

  • Playwright Installation: Playwright Chromium is now pre-installed in the Docker image to support automated testing, resolving issues with missing browser binaries and permission errors.
  • Documentation Updates: The README.md has been updated with a new section on "Playwright for Automated Testing," including instructions for persisting browsers across rebuilds.
  • Configuration Example: The docker-compose.override.yml.example file has been updated with clearer documentation on Playwright and instructions for persisting the browser cache.

@Shironex Shironex self-assigned this Feb 2, 2026
@Shironex Shironex added the Bug Something isn't working label Feb 2, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request pre-installs Playwright Chromium browsers in the Docker image to enable automated testing out of the box. The changes include updating the Dockerfile to install the browser, and modifying the README and docker-compose example file to document this feature and the optional persistence of the browser cache. My review focuses on improving the Docker layer caching and clarifying the documentation around using a persistent volume for the Playwright cache to avoid user confusion.

- Clean up npx cache after Playwright installation to reduce image size
- Clarify README: volume mounts persist cache across container lifecycles,
  not image rebuilds
- Add first-use warning: empty volume overrides pre-installed browsers,
  users must re-install with docker exec command

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@docker-compose.override.yml.example`:
- Around line 24-30: Update the manual Playwright update instruction so it runs
inside the automaker-server container as the automaker user with working
directory /app (so Playwright installs to /home/automaker/.cache/ms-playwright
instead of root); specifically replace the current example command that runs as
root with one that executes the install command in the automaker-server
container using user "automaker" and working directory "/app".

In `@Dockerfile`:
- Around line 122-126: The Playwright install RUN (the command "npx playwright
install chromium") must be moved so it runs after the COPY --from=server-builder
/app/node_modules ./node_modules step; update the Dockerfile to remove the early
RUN and add a new RUN after the node_modules copy that invokes the locally
installed playwright (e.g., use ./node_modules/.bin/playwright install chromium
or npx --no-install playwright install chromium) so the pinned version from
node_modules/package-lock.json is used and no registry fetch occurs.

In `@README.md`:
- Line 347: Replace the bold subsection title "**Optional: Persist browsers
across container rebuilds**" with a proper Markdown heading (e.g., "###
Optional: Persist browsers across container rebuilds" or another appropriate
heading level) so it stops being flagged by markdownlint MD036; locate the exact
bold text in README.md and change it to the chosen heading syntax.
- Around line 362-366: Update the README's manual browser update command so it
runs as the automaker user and from /app: change the documented docker exec
invocation that currently runs npx playwright install chromium in the container
(automaker-server) to include running as user "automaker" and setting the
working directory to "/app" (i.e. add the equivalent of the docker exec flags to
set user and workdir) so that the Playwright browser cache is written to the
automaker user's cache directory rather than root.
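
For illustration, the invocation described above can be written out as an argument list. This is a minimal TypeScript sketch, assuming the container is named automaker-server as in the compose example; `buildPlaywrightInstallArgs` is a hypothetical helper and not part of the repository.

```typescript
// Hypothetical helper: builds the argv for `docker exec` so the browser install
// runs as the automaker user from /app rather than as root.
function buildPlaywrightInstallArgs(container: string = 'automaker-server'): string[] {
  return [
    'exec',
    '--user', 'automaker', // cache is written under /home/automaker/.cache/ms-playwright
    '-w', '/app',          // run from the application directory
    container,
    'npx', 'playwright', 'install', 'chromium',
  ];
}

const installCommand = ['docker', ...buildPlaywrightInstallArgs()].join(' ');
console.log(installCommand);
// prints "docker exec --user automaker -w /app automaker-server npx playwright install chromium"
```

Keeping the user and working directory as explicit arguments makes it harder to fall back to a root install by accident when the command is copied into scripts.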

- Move Playwright install after node_modules copy to use pinned version
- Use local playwright binary instead of npx to avoid registry fetch
- Add --user automaker -w /app flags to docker exec commands
- Change bold text to proper heading in README (MD036 lint fix)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@Shironex Shironex added the Testers-Requested Request for others to test an enhancement or bug fix/etc. label Feb 2, 2026
@JasonBroderick

Tested this PR - found an issue and submitted a fix

I tested this PR on WSL2 Ubuntu and found that the Playwright install step fails during Docker build:

#63 [server server 16/20] RUN ./node_modules/.bin/playwright install chromium
#63 0.248 /bin/sh: 1: ./node_modules/.bin/playwright: not found

Root cause: The server-builder stage only copies apps/server/package.json, so playwright (which is only in apps/ui/package.json devDependencies) isn't available in the server's node_modules.

Fix: I've submitted #751 which adds @playwright/test to the server's devDependencies. After this fix, the build works correctly and Playwright verification runs successfully in Automaker.

Feel free to:

Happy to help however is most useful!

Shironex and others added 2 commits February 15, 2026 17:55
The Dockerfile's playwright install step requires the binary in
node_modules/.bin/, but playwright was only a UI dependency. This adds
@playwright/test to server devDependencies so the Docker build can
successfully run `./node_modules/.bin/playwright install chromium`.

Fixes the "playwright: not found" error during Docker image build.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docs/server/utilities.md (1)

20-20: ⚠️ Potential issue | 🟡 Minor

Update documentation to reflect current Claude model versions.

The CLAUDE_MODEL_MAP documentation at line 215 is outdated. It shows sonnet: 'claude-sonnet-4-20250514', but the actual code in libs/types/src/model.ts has sonnet: 'claude-sonnet-4-5-20250929'. Both CLAUDE_CANONICAL_MAP and CLAUDE_MODEL_MAP in the code consistently map to the same current version, so update the docs to match.

Current code (libs/types/src/model.ts)
export const CLAUDE_MODEL_MAP: Record<string, string> = {
  haiku: 'claude-haiku-4-5-20251001',
  sonnet: 'claude-sonnet-4-5-20250929',
  opus: 'claude-opus-4-6',
} as const;

export const CLAUDE_CANONICAL_MAP: Record<ClaudeCanonicalId, string> = {
  'claude-haiku': 'claude-haiku-4-5-20251001',
  'claude-sonnet': 'claude-sonnet-4-5-20250929',
  'claude-opus': 'claude-opus-4-6',
} as const;
🤖 Fix all issues with AI agents
In `@apps/server/src/routes/setup/routes/verify-claude-auth.ts`:
- Around line 324-345: The response may leave authType undefined when authMethod
is absent; update the authenticated branch to derive authType from the effective
method (treat missing authMethod as 'api_key' like earlier logic) so authType is
always set when authenticated; specifically, in the block that sets authType
(variables: authType, authenticated, authMethod) use the effective method value
and for the 'cli' case still call getClaudeAuthIndicators() and fall back to
'cli' on error, then return that authType in the res.json response.

In `@apps/ui/src/lib/agent-context-parser.ts`:
- Around line 35-44: The formatter formatModelName currently maps 'claude-opus'
to "Opus 4.5" because the check uses model.includes('opus') before the 4.6 case;
update formatModelName so that 'claude-opus' (and any other alias that should be
4.6) is handled by the 4.6 branch—e.g., check for 'claude-opus' (or include
'claude-opus' in the same condition as 'opus-4-6') before the generic 'opus'
check so the function returns "Opus 4.6" for that alias.
🧹 Nitpick comments (6)
libs/types/src/model-display.ts (1)

146-153: Consider placing 'adaptive' before the manual levels for clarity.

adaptive is semantically different from the manual intensity levels (low → ultrathink) — it's a mode where the model decides. Placing it last after ultrathink could imply it's "more" than ultrathink. Consider placing it second (after none) to separate modes from manual levels, matching the order returned by getThinkingLevelsForModel in settings.ts which returns ['none', 'adaptive'].

apps/server/tests/unit/lib/sdk-options.test.ts (1)

495-517: The thinkingLevel: 'none' test is misplaced in the "adaptive thinking" describe block.

The test at lines 507–516 verifies thinkingLevel: 'none' behavior, which isn't specific to adaptive thinking or Opus 4.6. It duplicates the pattern already tested elsewhere (e.g., createSpecGenerationOptions at lines 368–377) and is misleadingly scoped under "adaptive thinking for Opus 4.6". Consider removing it from this block or moving it to the createAutoModeOptions describe block above if coverage for 'none' with createAutoModeOptions is desired.

libs/types/src/settings.ts (1)

251-269: isAdaptiveThinkingModel uses fragile string matching — consider documenting the limitation.

The check model.includes('opus-4-6') won't match future opus versions (e.g., opus-4-7), despite the JSDoc claiming "Opus 4.6+". Also, model === 'claude-opus' only matches the exact canonical ID — not the alias 'opus' or the resolved string 'claude-opus-4-6' (though the includes branch covers the latter).

This is fine for now, but consider either:

  1. Updating the JSDoc to say "Opus 4.6" (not "4.6+") to match the actual behavior, or
  2. Using a broader pattern like model.includes('opus') if all future opus models will support adaptive thinking.
📝 Suggested JSDoc fix (minimal)
 /**
- * Check if a model uses adaptive thinking (Opus 4.6+)
+ * Check if a model uses adaptive thinking (Opus 4.6)
  * Adaptive thinking models let the SDK decide token allocation automatically.
  */
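
If the "4.6+" wording in the JSDoc is the intended behavior, one possible direction is to parse the version out of the model ID instead of matching a fixed substring. The sketch below is only an illustration under that assumption: `isAdaptiveThinkingModelByVersion` is a hypothetical name, and the ID shape `opus-<major>-<minor>` is inferred from the strings quoted in this review.

```typescript
// Hypothetical alternative to model.includes('opus-4-6'): compare the parsed
// version numerically so later Opus releases are also treated as adaptive.
function isAdaptiveThinkingModelByVersion(model: string): boolean {
  // Canonical ID that this PR maps to Opus 4.6.
  if (model === 'claude-opus') return true;
  // Minor version is limited to 1-2 digits so a date suffix directly after the
  // major version (e.g. "opus-4-20250514") is not mistaken for a minor version.
  const match = /opus-(\d+)-(\d{1,2})(?!\d)/.exec(model);
  if (!match) return false; // note: the bare alias 'opus' is still not matched
  const major = Number(match[1]);
  const minor = Number(match[2]);
  return major > 4 || (major === 4 && minor >= 6);
}

console.log(isAdaptiveThinkingModelByVersion('claude-opus-4-6')); // true
console.log(isAdaptiveThinkingModelByVersion('claude-opus-4-7')); // true
console.log(isAdaptiveThinkingModelByVersion('claude-sonnet-4-5-20250929')); // false
```

Whether future Opus versions should default to adaptive thinking is a product decision; the minimal JSDoc fix above avoids making that assumption.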
apps/server/src/providers/claude-provider.ts (1)

253-253: Truthy check on maxThinkingTokens will silently discard 0.

...(maxThinkingTokens && { maxThinkingTokens }) uses a truthy check. If getThinkingTokenBudget ever returns 0 (e.g., a new thinking level), it would be silently dropped. Use an explicit undefined check instead.

Proposed fix
-      ...(maxThinkingTokens && { maxThinkingTokens }),
+      ...(maxThinkingTokens !== undefined && { maxThinkingTokens }),
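
The difference between the two checks is easy to reproduce in isolation. The following sketch uses a hypothetical `SdkOptions` shape as a stand-in for the options object built in claude-provider.ts; only the spread expressions are taken from the review.

```typescript
// Hypothetical stand-in for the provider's options object.
interface SdkOptions {
  model: string;
  maxThinkingTokens?: number;
}

// Truthy check: a budget of 0 is falsy, so the key is never added.
function withTruthyCheck(model: string, maxThinkingTokens?: number): SdkOptions {
  return { model, ...(maxThinkingTokens && { maxThinkingTokens }) };
}

// Explicit undefined check: only a missing budget is skipped.
function withUndefinedCheck(model: string, maxThinkingTokens?: number): SdkOptions {
  return { model, ...(maxThinkingTokens !== undefined && { maxThinkingTokens }) };
}

console.log('maxThinkingTokens' in withTruthyCheck('claude-sonnet', 0));    // false
console.log('maxThinkingTokens' in withUndefinedCheck('claude-sonnet', 0)); // true
console.log('maxThinkingTokens' in withUndefinedCheck('claude-sonnet'));    // false
```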
apps/ui/src/components/views/settings-view/providers/codex-model-configuration.tsx (1)

29-59: Consider deriving CODEX_MODEL_INFO from CODEX_MODEL_CONFIG_MAP to avoid duplication.

This map duplicates id, label, and description already defined in CODEX_MODEL_CONFIG_MAP from libs/types/src/codex-models.ts. Any future model addition or description change must be updated in both places. You could derive the UI info from the shared type source:

import { CODEX_MODEL_CONFIG_MAP } from '@automaker/types';

const CODEX_MODEL_INFO: Record<CodexModelId, CodexModelInfo> = Object.fromEntries(
  Object.entries(CODEX_MODEL_CONFIG_MAP).map(([key, config]) => [
    key,
    { id: config.id, label: config.label, description: config.description },
  ])
) as Record<CodexModelId, CodexModelInfo>;
apps/server/src/routes/setup/routes/verify-claude-auth.ts (1)

324-339: Move authType derivation into a setup service.
The filesystem-based inspection is business logic; keeping the route thin will align better with the server architecture. As per coding guidelines: "Server business logic should be organized into services in the services/ directory, with Express route handlers in routes/ that delegate to services."
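
As a rough idea of what the extracted logic could look like, here is a sketch of a pure helper that a route handler could delegate to. Everything about its shape is an assumption: `deriveAuthType` is a hypothetical name, and `hasOAuthToken` is a synchronous stand-in for the lookup done through getClaudeAuthIndicators(), kept synchronous only to keep the example small. It also defaults a missing authMethod to 'api_key', which is the behavior the related inline comment asks for.

```typescript
type AuthMethod = 'api_key' | 'cli';
type AuthType = 'oauth' | 'api_key' | 'cli';

// Hypothetical service-level helper; the route would only pass in its inputs.
function deriveAuthType(
  authenticated: boolean,
  authMethod: AuthMethod | undefined,
  hasOAuthToken: () => boolean
): AuthType | undefined {
  if (!authenticated) return undefined;
  // Treat a missing method as 'api_key', matching the handler's earlier default.
  const effectiveMethod: AuthMethod = authMethod ?? 'api_key';
  if (effectiveMethod === 'api_key') return 'api_key';
  try {
    // OAuth token present means a Claude Code subscription login.
    return hasOAuthToken() ? 'oauth' : 'cli';
  } catch {
    // Fall back to generic CLI if the credential check fails.
    return 'cli';
  }
}

console.log(deriveAuthType(true, undefined, () => false)); // api_key
console.log(deriveAuthType(true, 'cli', () => true));      // oauth
```

Injecting the credential lookup as a parameter keeps the helper testable without touching the filesystem.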

Comment on lines 324 to 345
// Determine specific auth type for success messages
let authType: 'oauth' | 'api_key' | 'cli' | undefined;
if (authenticated) {
  if (authMethod === 'api_key') {
    authType = 'api_key';
  } else if (authMethod === 'cli') {
    // Check if CLI auth is via OAuth (Claude Code subscription) or generic CLI
    try {
      const indicators = await getClaudeAuthIndicators();
      authType = indicators.credentials?.hasOAuthToken ? 'oauth' : 'cli';
    } catch {
      // Fall back to generic CLI if credential check fails
      authType = 'cli';
    }
  }
}

res.json({
  success: true,
  authenticated,
  authType,
  error: errorMessage || undefined,

⚠️ Potential issue | 🟡 Minor

authType stays undefined when authMethod is omitted.
The handler treats a missing authMethod as api_key earlier, but this block only sets authType when authMethod is explicitly provided. Derive it from the effective method to keep responses consistent.

🛠️ Suggested fix
-      let authType: 'oauth' | 'api_key' | 'cli' | undefined;
-      if (authenticated) {
-        if (authMethod === 'api_key') {
+      const effectiveAuthMethod = authMethod ?? 'api_key';
+      let authType: 'oauth' | 'api_key' | 'cli' | undefined;
+      if (authenticated) {
+        if (effectiveAuthMethod === 'api_key') {
           authType = 'api_key';
-        } else if (authMethod === 'cli') {
+        } else if (effectiveAuthMethod === 'cli') {
           // Check if CLI auth is via OAuth (Claude Code subscription) or generic CLI
           try {
             const indicators = await getClaudeAuthIndicators();
             authType = indicators.credentials?.hasOAuthToken ? 'oauth' : 'cli';
Suggested change
// Determine specific auth type for success messages
const effectiveAuthMethod = authMethod ?? 'api_key';
let authType: 'oauth' | 'api_key' | 'cli' | undefined;
if (authenticated) {
  if (effectiveAuthMethod === 'api_key') {
    authType = 'api_key';
  } else if (effectiveAuthMethod === 'cli') {
    // Check if CLI auth is via OAuth (Claude Code subscription) or generic CLI
    try {
      const indicators = await getClaudeAuthIndicators();
      authType = indicators.credentials?.hasOAuthToken ? 'oauth' : 'cli';
    } catch {
      // Fall back to generic CLI if credential check fails
      authType = 'cli';
    }
  }
}
res.json({
  success: true,
  authenticated,
  authType,
  error: errorMessage || undefined,
🤖 Prompt for AI Agents
In `@apps/server/src/routes/setup/routes/verify-claude-auth.ts` around lines 324 -
345, The response may leave authType undefined when authMethod is absent; update
the authenticated branch to derive authType from the effective method (treat
missing authMethod as 'api_key' like earlier logic) so authType is always set
when authenticated; specifically, in the block that sets authType (variables:
authType, authenticated, authMethod) use the effective method value and for the
'cli' case still call getClaudeAuthIndicators() and fall back to 'cli' on error,
then return that authType in the res.json response.

Comment on lines 35 to 44
export function formatModelName(model: string): string {
  // Claude models
  if (model.includes('opus-4-6')) return 'Opus 4.6';
  if (model.includes('opus')) return 'Opus 4.5';
  if (model.includes('sonnet')) return 'Sonnet 4.5';
  if (model.includes('haiku')) return 'Haiku 4.5';

  // Codex/GPT models - specific formatting
  if (model === 'codex-gpt-5.3-codex') return 'GPT-5.3 Codex';
  if (model === 'codex-gpt-5.2-codex') return 'GPT-5.2 Codex';

⚠️ Potential issue | 🟡 Minor

Display label for claude-opus still shows 4.5.
claude-opus is treated as Opus 4.6 elsewhere, but this formatter returns “Opus 4.5” for that alias. Consider handling the alias in the 4.6 branch to keep labels consistent.

🛠️ Suggested fix
-  if (model.includes('opus-4-6')) return 'Opus 4.6';
+  if (model.includes('opus-4-6') || model === 'claude-opus') return 'Opus 4.6';
Suggested change
export function formatModelName(model: string): string {
  // Claude models
  if (model.includes('opus-4-6') || model === 'claude-opus') return 'Opus 4.6';
  if (model.includes('opus')) return 'Opus 4.5';
  if (model.includes('sonnet')) return 'Sonnet 4.5';
  if (model.includes('haiku')) return 'Haiku 4.5';
  // Codex/GPT models - specific formatting
  if (model === 'codex-gpt-5.3-codex') return 'GPT-5.3 Codex';
  if (model === 'codex-gpt-5.2-codex') return 'GPT-5.2 Codex';
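
Because the bug here comes from the order of the checks, an ordered rule table may make that order easier to see and review. This is a hypothetical variant covering only the Claude rules from the snippet; `formatClaudeModelName`, `CLAUDE_LABEL_RULES`, and the fallback to the raw ID are assumptions made for the example.

```typescript
// Hypothetical table-driven variant: rules are evaluated top to bottom, so the
// most specific match has to come first.
const CLAUDE_LABEL_RULES: Array<[(model: string) => boolean, string]> = [
  [(m) => m.includes('opus-4-6') || m === 'claude-opus', 'Opus 4.6'],
  [(m) => m.includes('opus'), 'Opus 4.5'],
  [(m) => m.includes('sonnet'), 'Sonnet 4.5'],
  [(m) => m.includes('haiku'), 'Haiku 4.5'],
];

function formatClaudeModelName(model: string): string {
  for (const [matches, label] of CLAUDE_LABEL_RULES) {
    if (matches(model)) return label;
  }
  return model; // assumption: unmatched IDs are returned unchanged
}

console.log(formatClaudeModelName('claude-opus'));     // Opus 4.6
console.log(formatClaudeModelName('claude-opus-4-6')); // Opus 4.6
```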
🤖 Prompt for AI Agents
In `@apps/ui/src/lib/agent-context-parser.ts` around lines 35 - 44, The formatter
formatModelName currently maps 'claude-opus' to "Opus 4.5" because the check
uses model.includes('opus') before the 4.6 case; update formatModelName so that
'claude-opus' (and any other alias that should be 4.6) is handled by the 4.6
branch—e.g., check for 'claude-opus' (or include 'claude-opus' in the same
condition as 'opus-4-6') before the generic 'opus' check so the function returns
"Opus 4.6" for that alias.

Collaborator Author

@Shironex Shironex left a comment

GitChorus AI Review

This is a well-structured PR that accomplishes multiple goals: (1) Docker Playwright pre-installation for automated testing, (2) Opus 4.6 model support with adaptive thinking, (3) new GPT-5.3-Codex model, (4) enhanced OAuth credential detection, and (5) improved debugging/troubleshooting UX.

Strengths:

  • The adaptive thinking implementation is clean and thorough - it correctly flows through settings.ts → sdk-options.ts → claude-provider.ts with proper handling at each layer
  • The credential detection enhancement with FileCheckResult/DirectoryCheckResult types adds valuable debuggability
  • Good fix for the credential search to continue past empty/tokenless credential files (the prior behavior would stop at an empty {} file)
  • Comprehensive test coverage with 761 lines of OAuth credential detection tests
  • The UI auto-normalizes thinking level when switching between adaptive and non-adaptive models (add-feature-dialog.tsx)
  • SDK dependency upgrades are pinned appropriately

Concerns:

  • Duplicate COPY libs/spec-parser/package*.json in Dockerfile (harmless but messy)
  • Missing trailing newline in .gitignore
  • The Dockerfile RUN command for Playwright has misleading || echo error handling

The PR is large but coherent. All the moving parts (types, server, UI, Docker) are updated consistently. The code quality is high with good documentation, tests, and defensive error handling. The title says "fix(docker)" but the scope is much broader than just Docker - this is really a multi-feature PR that also includes model upgrades and auth improvements.

Quality Score: 8/10 ⭐⭐⭐⭐⭐⭐⭐⭐☆☆

Findings Summary

| # | Severity | Category | Finding | Location |
|---|----------|----------|---------|----------|
| 1 | 🟡 Minor | Codebase-fit | Duplicate COPY libs/spec-parser in Dockerfile | Dockerfile:28 |
| 2 | 🟡 Minor | Logic | Misleading error handling in Playwright install RUN command | Dockerfile:155 |
| 3 | 🟡 Minor | Logic | Long paths silently truncated from the right in warning box | index.ts:214 |
| 4 | 🟡 Minor | Logic | ThinkingLevelSelector defaults to non-adaptive levels when model is omitted | thinking-level-selector.tsx:22 |
| 5 | 🔵 Nit | Style | Missing trailing newline in .gitignore | .gitignore:99 |
| 6 | 🔵 Nit | Performance | Redundant filesystem checks after successful auth verification | verify-claude-auth.ts:332 |
| 7 | 🔵 Nit | Style | Verbose debug logging with inconsistent prefix style | index.ts:126 |

via GitChorus

Comments Not Placed Inline

The following findings could not be placed as inline comments (line not in diff):

  • Dockerfile:28 — Line 28 not in diff for "Dockerfile"

.gitignore Outdated
data/
.codex/
.mcp.json
.planning No newline at end of file

Note

Nit - Style: Missing trailing newline in .gitignore

The .gitignore file now lacks a trailing newline. POSIX standard defines a line as ending with a newline character. Many tools warn about this, and git itself flags it in the diff. This was likely introduced when the .planning entry was added.

Problematic code:

+.planning
\ No newline at end of file

Suggested fix:

Add a trailing newline after the last line:

.mcp.json
.planning

# This adds ~300MB to the image but enables automated testing mode out of the box
# Using the locally installed playwright ensures we use the pinned version from package-lock.json
USER automaker
RUN ./node_modules/.bin/playwright install chromium && \

Important

Minor - Logic: Misleading error handling in Playwright install RUN command

In a POSIX shell, && and || have equal precedence and associate left to right, so the trailing || echo applies to the whole chain, not only to the ls command. If playwright install chromium fails, the remaining && steps are skipped, the fallback echo runs, and the RUN step still exits 0 with a misleading "Playwright browsers installed" message. The same happens if ls fails (unlikely but possible), which could hide a misconfigured browser install path. The build should fail whenever the install fails.

Problematic code:

RUN ./node_modules/.bin/playwright install chromium && \
    echo "=== Playwright Chromium installed ===" && \
    ls -la /home/automaker/.cache/ms-playwright/ || echo "Playwright browsers installed"

Suggested fix:

Use explicit grouping or just remove the fallback since the ls output is informational:
```dockerfile
RUN ./node_modules/.bin/playwright install chromium && \
    echo "=== Playwright Chromium installed ===" && \
    ls -la /home/automaker/.cache/ms-playwright/
```
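For reference, a shell evaluates an AND-OR list strictly left to right, with && and || at the same precedence, so a trailing || fallback also fires when the first command in the chain fails. The sketch below models that rule in TypeScript. The `Step` type and `evalAndOrList` function are illustrative names, not part of this PR, and each step only records whether the command would succeed.

```typescript
type Step = { op: 'first' | '&&' | '||'; name: string; succeeds: boolean };

// Evaluate an AND-OR list the way a POSIX shell does: left to right,
// with && and || at equal precedence. A skipped command leaves the
// previous exit status unchanged.
function evalAndOrList(steps: Step[]): { ran: string[]; ok: boolean } {
  const ran: string[] = [];
  let ok = true;
  for (const step of steps) {
    const shouldRun = step.op === 'first' || (step.op === '&&' ? ok : !ok);
    if (shouldRun) {
      ran.push(step.name);
      ok = step.succeeds;
    }
  }
  return { ran, ok };
}

// The original chain, with the install step failing.
const withFallback = evalAndOrList([
  { op: 'first', name: 'playwright install', succeeds: false },
  { op: '&&', name: 'echo banner', succeeds: true },
  { op: '&&', name: 'ls cache dir', succeeds: true },
  { op: '||', name: 'echo fallback', succeeds: true },
]);

// The suggested chain without the fallback.
const withoutFallback = evalAndOrList([
  { op: 'first', name: 'playwright install', succeeds: false },
  { op: '&&', name: 'echo banner', succeeds: true },
  { op: '&&', name: 'ls cache dir', succeeds: true },
]);

console.log(withFallback, withoutFallback);
```

With the fallback present, the failed install is followed directly by the fallback echo and the list reports success. Without it, the list reports failure, which is what stops the Docker build.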

pathsCheckedInfo = `
║ ║
║ ${'Paths checked:'.padEnd(BOX_CONTENT_WIDTH)}║
${pathsChecked.map((p) => `║ ${p.substring(0, BOX_CONTENT_WIDTH - 4).padEnd(BOX_CONTENT_WIDTH - 4)} ║`).join('\n')}`;

Important

Minor - Logic: Long paths silently truncated from the right in warning box

File paths in credential checks (e.g., /home/user/.claude/.credentials.json) can be quite long, especially in Docker or CI environments. The substring(0, BOX_CONTENT_WIDTH - 4) = substring(0, 63) will silently truncate longer paths. A label like Credentials: /home/very-long-username/.claude/.credentials.json (54+ chars) with the prefix would lose the end of the path. While the box formatting requires truncation, it would be more useful to truncate from the beginning (showing the filename) rather than the end.

Problematic code:

pathsChecked.map((p) => `║    ${p.substring(0, BOX_CONTENT_WIDTH - 4).padEnd(BOX_CONTENT_WIDTH - 4)}  ║`).join('\n')

Suggested fix:

Consider showing the end of the path rather than the beginning when it's too long:
```typescript
const maxLen = BOX_CONTENT_WIDTH - 4;
pathsChecked.map((p) => {
  // Keep the tail of the path (the filename) and mark the cut with '...'
  const display = p.length > maxLen ? '...' + p.slice(-(maxLen - 3)) : p;
  return `║    ${display.padEnd(maxLen)}  ║`;
}).join('\n')
```
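A self-contained sketch of the same idea, for trying the behaviour in isolation. The helper names `truncatePathStart` and `formatPathRow` are hypothetical, and `BOX_CONTENT_WIDTH = 67` is only inferred from the comment above (`BOX_CONTENT_WIDTH - 4` equals 63); the real constant in index.ts may differ.

```typescript
// Inferred from the review comment (BOX_CONTENT_WIDTH - 4 = 63); an assumption.
const BOX_CONTENT_WIDTH = 67;

// Keep the tail of a long path, where the filename lives, and mark the
// cut with a leading '...'. Assumes maxLen >= 3.
function truncatePathStart(path: string, maxLen: number): string {
  if (path.length <= maxLen) return path;
  return '...' + path.slice(path.length - (maxLen - 3));
}

// Render one row of the warning box for a checked path.
function formatPathRow(path: string): string {
  const maxLen = BOX_CONTENT_WIDTH - 4;
  return `║    ${truncatePathStart(path, maxLen).padEnd(maxLen)}  ║`;
}

const longPath = '/home/' + 'x'.repeat(80) + '/.claude/.credentials.json';
console.log(formatPathRow(longPath));
console.log(formatPathRow('/home/user/.claude/.credentials.json'));
```

Short paths pass through unchanged and are padded as before; only paths longer than the available width lose their leading directories.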

testIdPrefix = 'thinking-level',
model,
}: ThinkingLevelSelectorProps) {
const levels = getThinkingLevelsForModel(model || '');

Important

Minor - Logic: ThinkingLevelSelector defaults to non-adaptive levels when model is omitted

When model is not provided (undefined), the fallback is an empty string. getThinkingLevelsForModel('') returns the full manual thinking levels (none, low, medium, high, ultrathink) which excludes 'adaptive'. However, the ThinkingLevelSelector component is currently not imported anywhere in the codebase, so this is a latent issue. When it eventually gets used, callers must remember to pass the model prop for adaptive thinking to work correctly with Opus 4.6.

Problematic code:

const levels = getThinkingLevelsForModel(model || '');

Suggested fix:

This is acceptable as-is since the component isn't yet used, but consider documenting the required model prop more prominently:
```tsx
/** Model ID is required for correct thinking level filtering.
 *  Without it, adaptive thinking won't be available for Opus 4.6. */
model?: string;
```
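To illustrate why the empty-string fallback matters, here is a rough stand-in for the level lookup. Everything in it is an assumption made for the example: the function is named `sketchThinkingLevelsForModel` to keep it distinct from the real `getThinkingLevelsForModel`, adaptive models are detected by an `'opus-4-6'` substring, and adaptive models are assumed to offer only `'none'` and `'adaptive'`. Only the manual level names come from the comment above.

```typescript
type ThinkingLevel = 'none' | 'low' | 'medium' | 'high' | 'ultrathink' | 'adaptive';

const MANUAL_LEVELS: ThinkingLevel[] = ['none', 'low', 'medium', 'high', 'ultrathink'];
// Assumption: adaptive models expose only these two levels.
const ADAPTIVE_LEVELS: ThinkingLevel[] = ['none', 'adaptive'];

// Assumption: adaptive-capable models are recognised by this substring.
function isAdaptiveModel(model: string): boolean {
  return model.includes('opus-4-6');
}

// Hypothetical stand-in for getThinkingLevelsForModel.
function sketchThinkingLevelsForModel(model: string): ThinkingLevel[] {
  return isAdaptiveModel(model) ? ADAPTIVE_LEVELS : MANUAL_LEVELS;
}

// Mirrors the selector's fallback: an omitted model becomes ''.
function levelsForSelector(model?: string): ThinkingLevel[] {
  return sketchThinkingLevelsForModel(model || '');
}

console.log(levelsForSelector());                  // manual levels, no 'adaptive'
console.log(levelsForSelector('claude-opus-4-6')); // includes 'adaptive'
```

Under these assumptions, a caller that forgets the model prop never sees the adaptive option, which is the latent issue described above.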

} else if (authMethod === 'cli') {
// Check if CLI auth is via OAuth (Claude Code subscription) or generic CLI
try {
const indicators = await getClaudeAuthIndicators();

Note

Nit - Performance: Redundant filesystem checks after successful auth verification

After successfully verifying authentication via a full SDK query (which already proved auth works), this code calls getClaudeAuthIndicators() again just to determine the auth type. This performs additional filesystem reads (checking settings files, stats cache, projects dir, credential files) on every successful CLI auth verification. While not a significant performance issue, the indicator data was already fetched during the startup check. Consider caching or reusing that data.

Problematic code:

const indicators = await getClaudeAuthIndicators();

Suggested fix:

This is acceptable for now since it only runs during setup verification (not a hot path). Could be improved by caching the indicators or passing them through context, but low priority.
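One possible shape for the caching mentioned above is a small time-based wrapper around the async lookup. This is a sketch, not the project's code: `cacheAsync`, its `ttlMs` parameter, and the injectable `now` clock are hypothetical, and the commented usage assumes `getClaudeAuthIndicators` takes no arguments.

```typescript
// Wrap an async loader so calls within ttlMs reuse the same in-flight
// or resolved promise instead of repeating the filesystem checks.
function cacheAsync<T>(
  load: () => Promise<T>,
  ttlMs: number,
  now: () => number = Date.now,
): () => Promise<T> {
  let cached: { value: Promise<T>; at: number } | null = null;
  return () => {
    const t = now();
    let entry = cached;
    if (entry === null || t - entry.at >= ttlMs) {
      const fresh = { value: load(), at: t };
      // Drop a failed entry so the next call retries instead of
      // replaying the rejection until the TTL expires.
      fresh.value.catch(() => {
        if (cached === fresh) cached = null;
      });
      cached = fresh;
      entry = fresh;
    }
    return entry.value;
  };
}

// Hypothetical usage:
// const getIndicatorsCached = cacheAsync(getClaudeAuthIndicators, 30_000);
// const indicators = await getIndicatorsCached();
```

A short TTL keeps the data fresh enough for setup verification while avoiding back-to-back reads during a single verification flow.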

const hasAnthropicKey = !!process.env.ANTHROPIC_API_KEY;
const hasEnvOAuthToken = !!process.env.CLAUDE_CODE_OAUTH_TOKEN;

logger.debug('[CREDENTIAL_CHECK] Starting credential detection...');

Note

Nit - Style: Verbose debug logging with inconsistent prefix style

The new debug logging in the credential check is verbose - there are 6 debug log calls added throughout the auth detection flow. While helpful for debugging, the [CREDENTIAL_CHECK] prefix is inconsistent with other logging patterns in the file (which don't use bracketed prefixes). This is a minor style concern.

Problematic code:

logger.debug('[CREDENTIAL_CHECK] Starting credential detection...');
logger.debug('[CREDENTIAL_CHECK] Environment variables:', {
  hasAnthropicKey,
  hasEnvOAuthToken,
});

Suggested fix:

Consider using a consistent prefix format, e.g., using the logger's category instead:
```typescript
const credLogger = createLogger('CredentialCheck');
credLogger.debug('Starting credential detection...');
```

Or keep it as-is since it's debug-level only and useful for troubleshooting.
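As a rough illustration of the category-based approach, the sketch below attaches the category once when the logger is created, so individual call sites stay free of hand-written prefixes. This `createLogger` is a hypothetical stand-in written for the example; the project's real `createLogger` may have a different signature and output format, and the `sink` parameter exists only to make the sketch testable.

```typescript
type Logger = { debug: (message: string, ...args: unknown[]) => void };

// Hypothetical stand-in for the project's createLogger.
function createLogger(
  category: string,
  sink: (line: string, ...args: unknown[]) => void = (line, ...args) =>
    console.debug(line, ...args),
): Logger {
  return {
    // The category is applied here, not at each call site.
    debug: (message, ...args) => sink(`[${category}] ${message}`, ...args),
  };
}

const credLogger = createLogger('CredentialCheck');
credLogger.debug('Starting credential detection...');
credLogger.debug('Environment variables:', {
  hasAnthropicKey: false,
  hasEnvOAuthToken: false,
});
```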

- .gitignore: add missing trailing newline
- Dockerfile: remove misleading || echo fallback in Playwright install
- index.ts: truncate long paths from beginning instead of end in warning box
- verify-claude-auth.ts: use effectiveAuthMethod to prevent undefined authType
- agent-context-parser.ts: handle claude-opus alias as Opus 4.6
- thinking-level-selector.tsx: improve model prop documentation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@Shironex Shironex changed the base branch from main to v0.15.0rc February 15, 2026 17:17
@Shironex Shironex merged commit 1662c6b into v0.15.0rc Feb 15, 2026
6 checks passed
@Shironex Shironex deleted the fix/docker-playwright-missing-browsers branch February 15, 2026 17:17

Labels

- Bug: Something isn't working
- Testers-Requested: Request for others to test an enhancement or bug fix/etc.

Development

Successfully merging this pull request may close these issues.

[Bug]: Docker: Playwright verification fails - browsers not installed

2 participants